Abstract

The  Saliency  Network  proposed  by  Shashua  and  Ullman  [20]  is  a  well-known  approach  to  the  problem  of 
extracting  salient  curves  from  images  while  performing  gap  completion.  This  paper  analyzes  the  Saliency 
Network.  The  Saliency  Network  is  attractive  for  several  reasons.  First,  the  network  generally  prefers  long 
and  smooth  curves  over  short  or  wiggly  ones.  While  computing  saliencies,  the  network  also  fills  in  gaps 
with  smooth  completions  and  tolerates  noise.  Finally,  the  network  is  locally  connected,  and  its  size  is 
proportional  to  the  size  of  the  image. 

Nevertheless,  our  analysis  reveals  certain  weaknesses  with  the  method.  In  particular,  we  show  cases  in 
which  the  most  salient  element  does  not  lie  on  the  perceptually  most  salient  curve.  Furthermore,  in  some 
cases  the  saliency  measure  changes  its  preferences  when  curves  are  scaled  uniformly.  Also,  we  show  that 
for  certain  fragmented  curves  the  measure  prefers  large  gaps  over  a  few  small  gaps  of  the  same  total 
size.  In  addition,  we  analyze  the  time  complexity  required  by  the  method.  We  show  that  the  number  of 
steps  required  for  convergence  in  serial  implementations  is  quadratic  in  the  size  of  the  network,  and  in 
parallel  implementations  is  linear  in  the  size  of  the  network.  We  discuss  problems  due  to  coarse  sampling 
of  the  range  of  possible  orientations.  We  show  that  with  proper  sampling  the  complexity  of  the  network 
becomes  cubic  in  the  size  of  the  network.  Finally,  we  consider  the  possibility  of  using  the  Saliency  Network 
for  grouping.  We  show  that  the  Saliency  Network  recovers  the  most  salient  curve  efficiently,  but  it  has 
problems  with  identifying  any  salient  curve  other  than  the  most  salient  one. 
1  Introduction

In line drawings, certain shapes attract our attention more than others. For example, these
shapes may be the ones that are smooth, long, and closed (see for example Fig. 1). Shashua and
Ullman [20] proposed a method, which attracted considerable attention, to extract such shapes
from a line drawing. They defined a function that evaluates the “saliency” of a curve. Their
function has the following properties. First, when all other parameters are held constant, it
monotonically increases with the length of the evaluated curve. Second, it decreases
monotonically with the energy (the total squared curvature) of the curve. Third, the function
evaluates fragmented curves, in which case it penalizes according to the amount of
fragmentation. Finally, the saliency measure is the sum of contributions from different sections
of the curve, where these contributions decay with the sections’ accumulated energy and gap
length from the beginning of the curve. Using this saliency function, Shashua and Ullman defined
the “saliency map” of an image to be an image in which the intensity value of each pixel is
proportional to the score of the most salient curve emanating from that pixel.

A network of locally connected elements (the Saliency Network) was proposed for computing the
saliency map. The Saliency Network’s computation involves local interactions between image
locations, and its size is proportional to the size of the image. The network implements a
relaxation process that optimizes the saliency measure. As a consequence of the optimization,
the network can identify the most salient location in the image, which could lie on either an
open or closed curve. Additionally, the method attempts to fill in gaps smoothly while
simultaneously overcoming noise.

The problem of marking salient locations in images (“attention”) is also addressed in the work
of Guy and Medioni [7] and Subirana and Sung [21, 22]. Subirana and Sung extend Shashua and
Ullman’s method to find skeletons of regions. Using a different method from Shashua and
Ullman’s, Guy and Medioni also produce a saliency map from an edge image. In Guy and Medioni’s
scheme, each point in the image receives a saliency value equal to a weighted sum of
contributions from the individual edge elements. The weight assigned to an element is based on a
circular-arc completion between it and the image point; the weight decreases with the total
curvature of the arc, preferring straighter and shorter completions. Unlike in Shashua and
Ullman’s method, however, there is no attempt to optimize a measure of saliency over the set of
image curves.

Identifying salient structures in images is one of the objectives of perceptual grouping. By
perceptual grouping, we refer to the (bottom-up) process of grouping together structures in the
image that are likely to belong to a single object. Other tasks in perceptual grouping are image
segmentation and gap completion. For instance, [9, 11, 12, 13, 14, 15, 17, 18, 25, 26] extract
contours from the image according to certain optimization criteria, [23, 19, 1, 10, 3] compute
optimal curves for filling in gaps, and [2, 24, 5, 6, 8, 16, 27] identify occluded and
subjective contours.


In this paper we provide an analysis of Shashua and Ullman’s method. We examine both the
measure of saliency and the computational performance of the Saliency Network. Motivated by both
perceptual and computational reasons, we identify below three properties which we believe a
measure of saliency should satisfy. We then analyze Shashua and Ullman’s measure with respect to
these properties. The properties are:

Fidelity. To be consistent with examples such as Figure 1, the saliency map should highlight
the locations in the image that lie on perceptually salient curves. In particular, the most
salient location in the saliency map should lie on the most perceptually salient curve. Thus,
for example, in Figure 1 the most salient location in the saliency map should be on the circle
rather than on any of the surrounding line segments.

Invariance. In different images objects often appear in different positions and orientations or
in different sizes. A saliency measure for curves should be insensitive to such variations. In
particular, the measure should be invariant to 2D rigid transformations of the curve. In
addition, the measure should be consistent over different scales. That is, given two curves
$\Gamma_1$ and $\Gamma_2$, if $\Gamma_1$ is considered more salient than $\Gamma_2$, then
$\Gamma_1$ should remain more salient when the curves are scaled uniformly.

Performance on gaps. In Figure 1, as the size of gaps between edge elements is increased, our
perception of the circle fades. We therefore expect the measure of saliency to degrade with
gaps. Furthermore, we require a saliency measure to penalize one large gap more than a few small
gaps of the same total size. This requirement is motivated by psychophysical studies performed
by Elder and Zucker [4], which demonstrate that, when a fraction of the boundary of an object is
missing, humans’ recognition ability is hindered more when the missing fraction is contained all
in one gap than when it is spread over several gaps.

The Saliency Network is an efficient and elegant method, well suited to locating salient
structures in images. However, we found cases in which the network violates each of the above
three properties. On the issue of fidelity, the network indeed locates the perceptually salient
curves, so that long, smooth, closed curves are preferred over short, wiggly, open ones.
Nonetheless, our analysis reveals cases in which the most salient location in the saliency map
is not on the perceptually most salient curve. For example, if there are short line segments
touching a salient closed curve, then the short segments will often be judged more salient than
the closed curve. In this situation, the most salient location in the network will not lie on
the closed curve, but it will draw its saliency from the closed curve.

Since the saliency measure depends only on length and curvature, it is invariant to rigid
transformations. We show, however, that at times the measure changes its preferences when the
curves are scaled uniformly. For instance, consider a straight line and a circle of the same
length. For lengths less than a certain value, the line is preferred over the circle, whereas
for larger lengths this preference reverses. Shashua and Ullman’s rankings of curves, therefore,
are not invariant to uniform scaling of the image.

Figure 1: A fragmented circle in the middle of noise. The global shape of the circle is apparent.

Finally, the saliency measure can be applied to fragmented curves, in which case it will
attenuate with gap length. However, our analysis indicates that, when circles of the same size
and the same total gap length are compared, the measure prefers a circle with one long gap over
a circle with a few small gaps of the same total size.

In addition to studying properties of the saliency measure, we also examine the computational
properties of the Saliency Network. In particular, we analyze the convergence rate of the
network and show that the run-time complexity of the network in serial implementations is
quadratic in the number of elements. We then discuss problems due to coarse sampling of the
range of possible orientations. We show that, when the range of possible orientations is sampled
too coarsely, undesirable effects may occur in which corners are preferred over straight lines.
With proper sampling the complexity of the network becomes cubic in the size of the image.

Finally, we consider the possibility of using the Saliency Network for grouping. We note that,
in contrast to other existing methods for grouping that search over the exponentially large
space of all possible image curves (e.g., [9, 11, 17, 26]), the Saliency Network recovers the
most salient curve in time that is polynomial in the size of the image. However, the network
must make a single choice at every junction, and as a consequence has problems with identifying
salient curves other than the most salient one.

The paper proceeds as follows. Section 2 contains definitions. Section 3 includes an analysis
of the different properties of the saliency measure. Section 4 analyzes the time complexity of
the network computation. Section 5 analyzes the effects of sampling on the computation. Finally,
Section 6 discusses the issue of using the output of the network for grouping.


2  Definitions 

Shashua and Ullman defined their saliency measure as follows. For every pixel in the image,
there is a fixed set of “orientation elements” connecting the pixel to neighboring pixels
(Fig. 2-left). Each orientation element is called “actual” or “real” if it lies on an edge in
the underlying image, and otherwise it is called “virtual” or “gap” (see Fig. 3). Given a curve
$\Gamma$ composed of the $N + 1$ orientation elements $p_i, p_{i+1}, \ldots, p_{i+N}$
(Fig. 2-right), the saliency of $\Gamma$ is defined by

$$\Phi(\Gamma) = \sum_{j=i}^{i+N} \sigma_j \rho_{ij} C_{ij}, \tag{1}$$

with

$$\sigma_j = \begin{cases} 1, & \text{if } p_j \text{ is actual} \\ 0, & \text{if } p_j \text{ is virtual} \end{cases}$$

$$\rho_{ij} = \prod_{k=i}^{j} \rho_k = \rho^{g_{ij}},$$

where $\rho_{ii} = 1$ and where $\rho$ is some constant in the range [0, 1).¹ (Shashua and
Ullman set $\rho$ to 0.7.) $\sigma_j$ ensures that only actual elements will contribute to the
saliency measure, $g_{ij}$ is the number of gap elements between $p_i$ and $p_j$, and
$\rho_{ij}$ reduces the contribution of an element according to the total length of the gaps up
to that element. Further,

$$C_{ij} = e^{-K_{ij}}, \qquad K_{ij} = \int_{p_i}^{p_j} \kappa^2(s)\, ds,$$

where $\kappa(s)$ is the curvature at position $s$. $K_{ij}$ reduces the contribution of
elements according to the accumulated squared curvature from the beginning of the curve.
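The discrete measure of Eq. 1 can be illustrated with a minimal Python sketch. The function name `curve_saliency` and the input encoding (a list of actual/virtual flags plus the squared-curvature energy of each step between consecutive elements) are assumptions made for this illustration, not part of the original implementation.

```python
import math

def curve_saliency(actual, step_energy, rho=0.7):
    """Saliency of a curve in the spirit of Eq. 1 (illustrative sketch).

    actual      -- list of booleans, True for an actual element, False for a gap
    step_energy -- squared-curvature energy between consecutive elements
                   (len(actual) - 1 entries); hypothetical encoding
    rho         -- gap attenuation constant in [0, 1)
    """
    total = 0.0
    gaps = 0        # g_ij: gap elements seen so far
    energy = 0.0    # K_ij: accumulated squared curvature
    for j, is_actual in enumerate(actual):
        if j > 0:
            energy += step_energy[j - 1]
        if is_actual:
            # sigma_j = 1: the element contributes rho^g_ij * exp(-K_ij)
            total += rho ** gaps * math.exp(-energy)
        else:
            # sigma_j = 0: no contribution, but later elements are attenuated
            gaps += 1
    return total

# A straight, unbroken curve of five elements scores its length.
straight = curve_saliency([True] * 5, [0.0] * 4)
# One gap between two actual elements attenuates the second one by rho.
dashed = curve_saliency([True, False, True], [0.0, 0.0])
```

In this sketch a straight curve without gaps scores exactly its number of elements, which matches the linear growth with length discussed in Section 3.2.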


¹ The formula for $\rho_{ij}$ appeared in [20] as $\rho_{ij} = \prod_k \rho_k$, but the
computation actually performed by the network (which is given by Eq. 5) implements the modified
formula given here.


Figure 2: Example of the connectivity of Shashua and Ullman’s Saliency Network, for the cases of
sixteen and twenty-four orientation elements per pixel. In the left pictures, the neighbors of a
pixel $(x, y)$ are $\{(x + \Delta x, y + \Delta y) : \max(|\Delta x|, |\Delta y|) = \Delta_e\}$,
where $\Delta_e = 2$ for 16 elements per pixel and $\Delta_e = 3$ for 24 elements per pixel.
Given the pixel neighborhoods in the left pictures, the right pictures show examples of
five-element curves.


The saliency of an element $p_i$ is defined to be the maximum saliency over all curves
emanating from $p_i$:

$$\Phi(i) = \max_{\Gamma \in \mathcal{C}(i)} \Phi(\Gamma), \tag{2}$$

where $\mathcal{C}(i)$ denotes the set of curves emanating from $p_i$. Shashua and Ullman showed
how to compute $\Phi(i)$ on a network of locally connected elements. Denote by $\Phi_N(i)$ the
saliency of the most salient curve of length $N + 1$ or less emanating from $p_i$. The measure
$\Phi_N(i)$ satisfies

$$\Phi_N(i) = \max_{p_j \in \mathcal{N}(i)} F(i, j, \Phi_{N-1}(j)), \tag{3}$$

where $\mathcal{N}(i)$ is the set of all neighboring elements of $p_i$, and where $F(\cdot)$ is
a function of $\Phi_{N-1}(j)$ and constants stored at elements $p_i$ and $p_j$. Shashua and
Ullman referred to this type of measure as “extensible.”² In the Saliency Network,

$$F(i, j, \Phi_{N-1}(j)) = \sigma_i + \rho_i C_{ij} \Phi_{N-1}(j), \tag{4}$$

which gives

$$\Phi_N(i) = \sigma_i + \rho_i \max_{p_j \in \mathcal{N}(i)} C_{ij} \Phi_{N-1}(j). \tag{5}$$

Note that this recurrence relation updates each element’s saliency by taking a maximum over its
neighbors’ saliencies, but does not allow an element to retain its current saliency. This
observation raises the question of whether the saliencies are optimal over all curves that are
less than or equal to $N$ elements long or only over curves that are exactly $N$ elements long.
In fact the former is true, which we now show. First, note that the saliency measure in Eq. 1 is
monotonically non-decreasing with the number of elements $N$ on a curve. Consequently, at
iteration $N + 1$ every element has the option of choosing the same neighbor as it chose at
iteration $N$, and thus obtain a new saliency that is no less than its current saliency.
Therefore, it is sufficient to not include an element’s current saliency when taking the
maximum, because there will be at least one neighbor through which the element can obtain a new
saliency that is as great as its own.

² Note that this definition of extensibility is different from that used by Brady et al. [1].
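The recurrence of Eq. 5 can be sketched as a synchronous update over a small directed graph of elements. This is a minimal illustration under stated assumptions: the function `run_network`, the dictionary-based graph encoding, the initialization $\Phi_0(i) = \sigma_i$, and the choice $\rho_i = 1$ for actual elements and $\rho_i = \rho$ for virtual ones are hypothetical conventions chosen to agree with $\rho_{ij} = \rho^{g_{ij}}$, not a reproduction of the original network code.

```python
def run_network(sigma, neighbors, coupling, n_iters, rho=0.7):
    """Iterate Phi_N(i) = sigma_i + rho_i * max_j C_ij * Phi_{N-1}(j).

    sigma     -- list of 0/1 flags (1 = actual element, 0 = virtual)
    neighbors -- dict: element index -> list of successor element indices
    coupling  -- dict: (i, j) -> C_ij, the curvature attenuation from i to j
    n_iters   -- number of synchronous update steps
    """
    # Assumed convention: rho_i is 1 for actual elements and rho for gaps.
    rho_i = [1.0 if s else rho for s in sigma]
    phi = [float(s) for s in sigma]          # assumed start: Phi_0(i) = sigma_i
    for _ in range(n_iters):
        # Every element reads only the previous iteration's values.
        phi = [
            sigma[i] + rho_i[i] * max(
                (coupling[(i, j)] * phi[j] for j in neighbors[i]),
                default=0.0,                 # element with no successor
            )
            for i in range(len(sigma))
        ]
    return phi

# A straight chain 0 -> 1 -> 2 -> 3 of actual elements with C_ij = 1.
chain = run_network(
    sigma=[1, 1, 1, 1],
    neighbors={0: [1], 1: [2], 2: [3], 3: []},
    coupling={(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0},
    n_iters=3,
)
```

After three iterations the first element of the chain has accumulated the saliency of the whole four-element curve, and no element's value ever decreases between iterations, which is the monotonicity argument made above.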

To make the saliency measure $\Phi$ independent of the particular implementation, we introduce
a continuous version of $\Phi$. Given a curve $\Gamma(s)$ of length $l$ ($0 \le s \le l$, where
$s$ denotes arc length), we define $\Phi$ by

$$\Phi(\Gamma) = \int_0^l \sigma(s)\, \rho(0, s)\, C(0, s)\, ds, \tag{6}$$

where

$$\sigma(s) = \begin{cases} 1, & \text{if } \Gamma(s) \text{ is actual} \\ 0, & \text{if } \Gamma(s) \text{ is virtual} \end{cases}$$

$$\rho(s_1, s_2) = \rho^{g(s_1, s_2)}$$

$$C(s_1, s_2) = e^{-K(s_1, s_2)}$$

where $g(s_1, s_2)$ is the total gap length of $\Gamma$ between $s_1$ and $s_2$ and
$K(s_1, s_2)$ is the energy of the curve between $s_1$ and $s_2$, which are defined by

$$g(s_1, s_2) = \int_{s_1}^{s_2} (1 - \sigma(t))\, dt, \tag{7}$$

$$K(s_1, s_2) = \int_{s_1}^{s_2} \kappa^2(t)\, dt. \tag{8}$$

A useful tool in computing saliencies is the following rule. Given a curve $\Gamma$ which is
composed of two smoothly concatenated sections, $\Gamma_1$ and $\Gamma_2$, the saliency of
$\Gamma$ is given by

$$\Phi(\Gamma) = \Phi(\Gamma_1) + \rho^{g(\Gamma_1)} e^{-K(\Gamma_1)} \Phi(\Gamma_2), \tag{9}$$

where $g(\Gamma_1)$ is the total gap length and $K(\Gamma_1)$ is the energy of $\Gamma_1$.

Figure 3: Left: Input image is a binary edge map. In the picture the black squares represent
edge pixels. Right: The Saliency Network is defined on top of the edge map. The network is
composed of locally connected elements which are called “active” if they lie on edges and “gaps”
if they do not. In the right picture, the dashed line segments between eight-connected pixels
represent active elements, and the remaining line segments represent gaps. For viewing purposes,
every element was set to have eight neighbors, although in Shashua and Ullman’s implementation
every element had sixteen neighbors, and in our implementation every element had twenty-four
neighbors.

Over the remainder of the paper, we will present examples of the Saliency Network on simulated
and real images. Our implementation replicates Shashua and Ullman’s original implementation,
except that we increased the number of orientation elements per pixel to obtain greater
accuracy. We used twenty-four orientation elements per pixel, whereas Shashua and Ullman used
sixteen elements per pixel. We also set $\rho = 0.7$ as in the original implementation.


3  Properties  of  the  Saliency  Measure 

We begin our analysis by examining the saliency measure proposed by Shashua and Ullman.
Section 3.1 below discusses the treatment of cycles. Section 3.2 analyzes the behavior of the
measure when applied to simple curves. Lastly, Section 3.3 analyzes the behavior of the measure
when applied to curves that include gaps.


3.1  Cycles 


The measure of saliency proposed by Shashua and Ullman is a positive function that increases
monotonically with the lengths of the curves in the image. Closed curves (cycles) are considered
to have infinite length, even though they form finite structures in the image. Shashua and
Ullman showed that their network is guaranteed to converge when applied to closed curves. The
reason it converges is that the contribution to the saliency from remote elements attenuates
geometrically with the curvature accumulated from the beginning of the curve. In cycles this
generates a geometric series that converges to a finite value.

Formally, given a closed curve $\Gamma$, denote by $\Phi_1$ the saliency of an element of
$\Gamma$ that is obtained by starting from that element and then proceeding once around the
curve. Denote by $K$ the total squared curvature of the cycle and by $g$ the cycle’s total gap
length. Then by repeatedly applying Eq. 9 we obtain

$$\Phi(\Gamma) = \Phi_1 + \rho^g e^{-K} \Phi_1 + \rho^{2g} e^{-2K} \Phi_1 + \cdots = \frac{\Phi_1}{1 - \rho^g e^{-K}}.$$
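The convergence of the cycle saliency can be checked numerically. The sketch below, with hypothetical function names, compares the closed form of the geometric series with a truncated sum over a finite number of laps, where each lap is attenuated by $\rho^g e^{-K}$ relative to the previous one.

```python
import math

def cycle_saliency(phi_once, K, g, rho=0.7):
    """Closed-form saliency of a cycle: phi_once / (1 - rho^g * exp(-K))."""
    return phi_once / (1.0 - rho ** g * math.exp(-K))

def cycle_saliency_partial(phi_once, K, g, laps, rho=0.7):
    """Saliency accumulated after a finite number of laps around the cycle."""
    ratio = rho ** g * math.exp(-K)      # attenuation added by each full lap
    return sum(phi_once * ratio ** n for n in range(laps))

# Example values chosen only for illustration.
closed_form = cycle_saliency(phi_once=10.0, K=0.5, g=1)
truncated = cycle_saliency_partial(phi_once=10.0, K=0.5, g=1, laps=200)
```

Because the ratio $\rho^g e^{-K}$ is strictly below one for any cycle with nonzero curvature, the truncated sum approaches the closed form quickly as the number of laps grows.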

When the network is applied to an open curve, after going once along the curve it is possible
for the network to then take a 180° turn and walk back along the curve. The saliency of the
resulting closed curve could be considered to be the saliency of the open curve at convergence.
As we shall see next, the attenuation due to the 180° turn is so high that the additional score
is negligible. Let $\Gamma$ be an open curve. Let $\Phi_f$ and $\Phi_b$ be the saliencies of
$\Gamma$ measured by going once along the curve in the forward and backward directions,
respectively. Denote by $K$ the total squared curvature of $\Gamma$. Then the saliency of
$\Gamma$ is given by

$$\Phi(\Gamma) = \Phi_f + e^{-K - \pi^2} \Phi_b + e^{-2K - 2\pi^2} \Phi_f + \cdots
= \frac{\Phi_f + e^{-K - \pi^2} \Phi_b}{1 - e^{-2K - 2\pi^2}}.$$

If $\Gamma$ is symmetric then $\Phi_f = \Phi_b$ and we obtain

$$\Phi(\Gamma) = \frac{\Phi_f}{1 - e^{-K - \pi^2}} = \frac{\Phi_f}{1 - 0.000051723\, e^{-K}}.$$

The largest increase in saliency is obtained for a straight line ($K = 0$), where the saliency
becomes

$$\Phi(\Gamma) = 1.000051725\, \Phi_f. \tag{13}$$

One can see that the additional saliency obtained by wrapping around an open curve is very
small and practically can be ignored. As a consequence, the network is likely to prefer
connecting the curve through gaps to other curves, or even around to itself, since such
connections often will result in higher saliencies.
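The size of the wrap-around gain can be verified with a short calculation. The sketch below assumes, as the constants above suggest, that a 180° turn adds a squared-curvature energy of $\pi^2$; the function name `open_curve_saliency` is hypothetical.

```python
import math

# Attenuation contributed by a single 180-degree turn (assumed energy pi^2).
TURN_FACTOR = math.exp(-math.pi ** 2)

def open_curve_saliency(phi_f, phi_b, K):
    """Saliency of an open curve traversed back and forth indefinitely."""
    a = math.exp(-K - math.pi ** 2)   # attenuation of one pass plus one turn
    return (phi_f + a * phi_b) / (1.0 - a * a)

# Relative gain for a symmetric straight line (K = 0, phi_f = phi_b = 1).
straight_line_gain = open_curve_saliency(1.0, 1.0, 0.0)
```

The turn factor is of the order of $5 \times 10^{-5}$, so even in the most favorable case of a straight line the wrapped-around saliency exceeds the single-pass saliency by only a few parts in a hundred thousand.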

3.2  Straight  lines  and  circles 

In this section we compute the saliencies of a few simple curves. We then use these simple
curves to examine the issues of fidelity and invariance. In general, we will only be interested
in the measure of saliency obtained for the most salient element of the curve. Throughout this
section we shall use the continuous definition of the saliency measure (Eq. 6). We consider only
curves with no gaps (we will analyze curves with gaps in Section 3.3); hence $\sigma(s) = 1$ and
$\rho(0, s) = 1$ for all $s$. Eq. 6 therefore becomes

$$\Phi(\Gamma) = \int_0^l C(0, s)\, ds, \tag{14}$$

where

$$C(0, s) = e^{-\int_0^s \kappa^2(t)\, dt}.$$

The examples below demonstrate some of the problems with Shashua and Ullman’s saliency measure.
In particular, we compare the saliency of a line segment of length $l$ to that of a circle of
perimeter $l$. We show that for small values of $l$, the straight line is preferred over the
circle, and that this preference reverses for large values of $l$. The saliency function,
therefore, ranks curves differently when these curves are scaled uniformly. In another example,
we analyze the results of applying the saliency measure to a picture containing a circle and
short line segments connected to it. We see that a short line segment increases its saliency
value by connecting to the circle. As a result of this increase, it is not unusual for a short
segment to become more salient than a circle. The saliency of the short line segment in this
case represents the saliency of the circle, but the most salient element is in fact not part of
the circle.

We begin by deriving explicit formulas for the saliency of straight lines and curves. For
straight lines $C(0, s) = 1$ for all $s$. Therefore, ignoring the possibility that a line can
wrap around itself (see Section 3.1), a straight line of length $l$ will obtain the score
$\Phi(\Gamma) = l$. The saliency of a straight line, therefore, grows linearly with the length
of the line.

For a circle of radius $r$, the curvature is constant, $\kappa = 1/r$, and so for a circular arc
of length $s$,

$$C(0, s) = e^{-\int_0^s \frac{1}{r^2}\, dt} = e^{-s/r^2}. \tag{15}$$

The saliency attributed to the circular arc is

$$\Phi(0, s) = \int_0^s C(0, t)\, dt = \int_0^s e^{-t/r^2}\, dt = r^2 \left(1 - e^{-s/r^2}\right). \tag{16}$$

At convergence ($s = \infty$), the saliency of the circle is given by

$$\Phi(\Gamma) = \lim_{s \to \infty} r^2 \left(1 - e^{-s/r^2}\right) = r^2. \tag{17}$$

The score of a circle, therefore, grows quadratically with the radius (and thus also with the
perimeter) of the circle.

The fact that the saliency of a straight line grows linearly with its length, whereas the
saliency of a circle grows quadratically with its perimeter, suggests that the network may treat
the two differently when they are scaled. Consider a straight line of length $l$ and a circle of
perimeter $l = 2\pi r$. These two entities will have exactly the same saliency when $l = 0$ and
when $l = 4\pi^2 \approx 39.48$. (The saliencies in the two cases are 0 and $4\pi^2$,
respectively.) When $0 < l < 4\pi^2$ the line will be more salient than the circle, whereas when
$l > 4\pi^2$ the circle will be more salient. Fig. 4 shows an example of three images, each of
which contains a straight line and a circle of the same length. Consistent with our analysis,
the Saliency Network found the straight line to be more salient than the circle at shorter
lengths, and found the circle to be more salient at longer lengths.
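The reversal of preference between a line and a circle of equal length can be reproduced from the continuous formulas alone. The following is a minimal sketch with hypothetical function names; it evaluates the ideal continuous measure, so its values are not expected to match the discretized network outputs exactly.

```python
import math

def line_saliency(length):
    """Continuous saliency of a straight line: equal to its length."""
    return float(length)

def circle_saliency(perimeter):
    """Continuous saliency of a circle at convergence: r^2 (Eq. 17)."""
    r = perimeter / (2.0 * math.pi)
    return r * r

def arc_saliency(r, s):
    """Saliency of a circular arc of radius r and length s (Eq. 16)."""
    return r * r * (1.0 - math.exp(-s / (r * r)))

# Length at which a line and a circle of equal length score the same.
CROSSOVER = 4.0 * math.pi ** 2

for length in (27, 39, 84):
    winner = "line" if line_saliency(length) > circle_saliency(length) else "circle"
    print(length, winner)
```

The loop uses the three lengths of the example images: below the crossover length the line wins, and above it the circle wins.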

Figure 4: Lack of scale invariance in the Saliency Network. Top figures: three images that
contain a straight line and a circle of roughly the same length. Bottom figures: the most
salient curves that were found in these images. Lengths are 27 (left), 39 (middle), and 84
(right). The saliency values obtained for the circles are 15.39 (left), 33.60 (middle), and
132.05 (right), and for the lines are 27.06 (left), 39.00 (middle), and 84.00 (right).

A different problem is encountered in the case of a circle connected to short line segments.
Consider the picture in Fig. 5-left. The circle seems to be the most perceptually salient curve
in this image. Counterintuitively, the most salient element computed by the network is on one of
the line segments connected to the circle, thus violating the fidelity requirement. The reason
is that a neighboring line segment may increase its saliency by connecting to the circle,
without affecting the saliency of the circle. Consider, for example, a circular arc of length 1
and curvature $\kappa$ connected smoothly to a circle of radius $r$ (which corresponds to a
single element connected smoothly to the circle via curvature $\kappa$). Using Eq. 9 we obtain
that the saliency of the first element on the arc is

$$\Phi_e = \Phi(\Gamma) + e^{-\kappa^2} \Phi_c, \tag{18}$$

where $\Gamma$ represents the circular arc and $\Phi_c$ is the saliency of the circle. Now,
using Eq. 15,

$$\Phi(\Gamma) = \int_0^1 C(0, s)\, ds = \frac{1 - e^{-\kappa^2}}{\kappa^2}. \tag{19}$$

Combining Eqs. 17-19, we obtain that

$$\Phi_e = \frac{1 - e^{-\kappa^2}}{\kappa^2} + e^{-\kappa^2} r^2. \tag{20}$$

If we now compare the saliency of the element, $\Phi_e$, to that of the circle, $\Phi_c$
(Eq. 17), we obtain that $\Phi_e > \Phi_c$ when

$$\frac{1 - e^{-\kappa^2}}{\kappa^2} + e^{-\kappa^2} r^2 > r^2, \tag{21}$$

so that

$$|\kappa| < \kappa_c, \tag{22}$$

where $\kappa_c = 1/r$. That is, the element will be more salient than the circle if and only if
it connects to the circle at a curvature that is less than the curvature of the circle. This is
consistent with the network’s preference for straight curves. Notice that if the element is a
line tangential to the curve ($\kappa = 0$) the element will be more salient than the circle
regardless of the circle’s radius.
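The threshold on the connecting curvature can be checked numerically. In the sketch below, `element_saliency` is a hypothetical helper that evaluates the saliency of a unit-length arc of curvature `kappa` attached smoothly to a circle of radius `r`, using the arc integral plus the attenuated circle score.

```python
import math

def element_saliency(kappa, r):
    """Saliency of a unit arc of curvature kappa joined to a circle of radius r."""
    k2 = kappa * kappa
    # Saliency of the unit arc itself; the limit as kappa -> 0 is 1.
    arc = 1.0 if k2 == 0.0 else -math.expm1(-k2) / k2
    # The circle's score r^2 is attenuated by the arc's energy kappa^2.
    return arc + math.exp(-k2) * r * r

r = 5.0                      # circle of curvature 1/r = 0.2 (example value)
circle = r * r               # saliency of the circle itself
tangent = element_saliency(0.0, r)    # straight tangential element
gentle = element_saliency(0.1, r)     # curvature below 1/r
sharp = element_saliency(0.3, r)      # curvature above 1/r
```

With these example values the tangential and gently curved elements both exceed the circle's own score, while the sharply curved element falls below it, in agreement with the condition that the connecting curvature be smaller than that of the circle.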

This phenomenon, that curves connecting to a circle may increase their saliencies due to these
connections and actually beat the circle, is more likely to occur for longer curves. Suppose a
curve $\Gamma$ connects to a circle $C$ such that the total squared curvature of $\Gamma$,
including the connection point, is $K$. Then the saliency of the element on $\Gamma$ that is
most distant from the circle is given by Eq. 18, where $\kappa^2$ is replaced by $K$, namely,

$$\Phi_e = \Phi(\Gamma) + e^{-K} \Phi_c. \tag{23}$$

The longer $\Gamma$ is, the more likely it is to become more salient than the circle. Suppose
for example that $\Gamma$ is a straight line of length $l$ that connects to the circle via
curvature $\kappa$. We have that $\Phi(\Gamma) = l$ and $K = \kappa^2$, which implies

$$\Phi_e = l + e^{-\kappa^2} r^2. \tag{24}$$

Now $\Phi_e > \Phi_c$ when

$$l + e^{-\kappa^2} r^2 > r^2. \tag{25}$$

Substituting $r = 1/\kappa_c$, this implies that

$$\kappa_c^2 > \frac{1 - e^{-\kappa^2}}{l} \approx \frac{\kappa^2}{l} \tag{26}$$

when $\kappa$ is small, or

$$|\kappa| < \kappa_c \sqrt{l}. \tag{27}$$

Clearly, the longer the line is, the more likely it is to become more salient than the circle.
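The dependence on the length of the attached line can be made concrete by solving the inequality for the shortest line that beats the circle. The helper names below are hypothetical, and the example values are chosen only for illustration.

```python
import math

def line_element_saliency(l, kappa, r):
    """Saliency at the far end of a line of length l joined to a circle of
    radius r through a connection of curvature kappa."""
    return l + math.exp(-kappa * kappa) * r * r

def min_line_length(kappa, r):
    """Line length above which the attached line outscores the circle,
    obtained by rearranging l + exp(-kappa^2) r^2 > r^2."""
    return r * r * -math.expm1(-kappa * kappa)

r, kappa = 5.0, 0.3           # connection sharper than the circle (1/r = 0.2)
threshold = min_line_length(kappa, r)
short_line = line_element_saliency(1.0, kappa, r)
long_line = line_element_saliency(3.0, kappa, r)
```

Here a single element connected at this curvature would lose to the circle, yet a line only a few elements long already wins. For small curvatures the threshold is close to $\kappa^2 r^2$, which is the small-curvature approximation used above.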

Fig. 5-left shows a picture of a circle with a few short curves connected to it. When the
Saliency Network is applied to this picture, the most salient element does not lie on the
circle, although most of its saliency is due to the circle. Indeed, if we disconnect these short
curves from the circle, then the circle becomes the most salient structure in the image.

3.3  Curves  with  gaps 

One of the most important properties of Shashua and Ullman’s Saliency Network is its ability to
fill in gaps while computing the saliencies. The network handles gaps by using virtual elements,
which compute the saliencies of curves emanating from their locations and transfer these
saliencies to their neighboring elements. Via these transfers, actual elements evaluate the
saliencies of curves that emanate from their locations and contain any number of gaps. The
network avoids curves with large gaps by attenuating the scores of curves exponentially with gap
size.

In this section we analyze the performance of the Saliency Network in the presence of gaps.
Because the saliency measure attenuates exponentially with gap size, the network is capable of
overcoming small gaps, but is unlikely to overcome large ones. As an example, consider the
problem mentioned in Section 3.2, that a short line segment in the neighborhood of a circle may
increase its saliency by connecting to the circle. One consequence of the fast attenuation is
that this problem almost disappears when the segment is not physically connected to the circle.
On the other hand, we show below that, due to the exponential decay, very long structures
(straight lines and circles) obtain very low scores even when only a small fraction of the
curves are gaps.

Figure 5: An example of a circle with a few short curves connecting to it. The most salient
element (for which $\Phi = 136.63$) was not on the circle, although its saliency came mostly
from the circle (the saliency of the circle is 130.74). If short gaps were added between the
curves and the circle, the circle would become the most salient curve in the image.

Finally, we explore the question of whether the network prefers fragmented curves (dashed
lines) over curves with single gaps of the same total size. At first glance Shashua and Ullman’s
saliency measure appears indifferent to this property, because the total size of gaps is taken
into account, irrespective of the fragmentation. In fact, for open curves there is no clear
preference between a curve having many small gaps or a few long gaps. For closed curves,
however, we show that a curve with a single large gap is preferred over the same curve with
several small gaps of the same total size; this preference is inconsistent with the
psychophysical experiments of Elder and Zucker [4].

In computing the saliency of a fragmented curve, gaps affect the total
score in two ways (see Eq. 1). First, gap elements themselves do not
contribute at all to the total score (since $\sigma_j = 0$ for virtual
elements). Second, the actual elements of the curve that lie on the
other side of a gap are attenuated by a factor $\rho^{g}$, where $g$ is
the total gap length accumulated from the beginning of the curve.
Consider for example a curve $\Gamma$ with one gap of length $g$. Denote
the first part of the curve (before the gap) by $\Gamma_1$ and the
second part of the curve (after the gap) by $\Gamma_2$, and denote by
$K(0,m)$ the total squared curvature of $\Gamma_1$ plus the gap. The
saliency of $\Gamma$ is given by (Eq. 9)

$$\Phi(\Gamma) = \Phi(\Gamma_1) + \rho^{g} e^{-K(0,m)} \Phi(\Gamma_2). \tag{28}$$

From this formula, $\Gamma_1$ contributes to the saliency of $\Gamma$ as
if there were no gap, the gap elements contribute nothing, and the
contribution of $\Gamma_2$ is attenuated by $\rho^{g} e^{-K(0,m)}$.
Clearly, the longer the portion of $\Gamma$ before the gap ($\Gamma_1$),
the less the saliency of $\Gamma$ will be attenuated. If the gap appears
near the end of the curve, the saliency of $\Gamma$ is hardly
attenuated. If the gap appears at the beginning, the entire saliency of
$\Gamma$ is attenuated by that factor. Notice that since the network
evaluates open curves starting from both endpoints, if a curve contains
a relatively smooth section on one of its sides and a relatively wiggly
section on its other side, then the highest score will be obtained when
the gaps are distributed along the wiggly side.
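As a small numeric illustration of Eq. 28, the following Python sketch evaluates a straight curve with a single gap placed at three different positions. It is not part of the original analysis: the function name `one_gap_saliency`, the lengths, and the choice $\rho = 0.7$ are illustrative assumptions, and the curve is taken to be straight so that the curvature factor equals 1.

```python
def one_gap_saliency(len_before, gap, len_after, rho=0.7):
    """Saliency of a straight curve with a single gap, following Eq. 28.

    Assumes zero curvature (so the curvature factor is 1) and a unit
    contribution per unit length, so that each gap-free straight part
    scores its own length.
    """
    return len_before + (rho ** gap) * len_after


# Same actual length (20) and same gap length (2), three gap positions.
gap_at_start = one_gap_saliency(0.0, 2.0, 20.0)    # whole curve attenuated
gap_in_middle = one_gap_saliency(10.0, 2.0, 10.0)  # second half attenuated
gap_at_end = one_gap_saliency(20.0, 2.0, 0.0)      # nothing attenuated
print(gap_at_start, gap_in_middle, gap_at_end)
```

Under these assumptions the score grows as the gap moves away from the starting point, which is the ordering described above.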

Consider now a straight line $\Gamma$ with gaps distributed uniformly
along the line. Let $p$ ($0 < p < 1$) be the fraction of the line
containing actual elements, and let $q = 1 - p$ be the fraction of the
line that is virtual. We can thus set $\sigma(s) = p$. The gap length
$g$ of a line segment of length $l$ is given by $ql$. Since we are
dealing with a straight line, $C(0,s) = 1$ for all $s$. Consequently,
the expected saliency of a straight line of length $l$ with fraction $q$
in uniform gaps is given by (Eq. 6)

$$\Phi(\Gamma) = p \int_0^{l} \rho^{qs}\,ds = \frac{p}{q \ln \rho}\left(\rho^{ql} - 1\right). \tag{29}$$

As $l$ approaches infinity, this score converges to

$$\Phi_\infty = -\frac{p}{q \ln \rho}. \tag{30}$$

Thus, the saliency of an infinitely long straight line with uniformly
distributed gaps is always finite and, in fact, proportional to $p/q$.
Note that, since the saliency measure monotonically increases with the
length of a curve, the score of an infinitely long straight line with
uniform gaps provides an upper bound on the score of any finitely long
line segment with the same distribution of gaps.

Examples of the values assumed by $\Phi_\infty$ as a function of $p$ and
$\rho$ are given in Table 1. Consider $\rho = 0.7$: when 95% of the line
consists of actual elements (5% gaps), the score is only 53.27, and when
90% of the line consists of actual elements (10% gaps), the score drops
to 25.23. Since a gap-free straight line scores its own length, a
gap-free line of length 54 is more salient than any line, however long,
that contains 5% gaps. Similarly, a gap-free line of length 26 is always
more salient than an infinite line with 10% gaps.
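The entries of Table 1 can be reproduced directly from Eqs. 29 and 30. The sketch below is an illustrative check rather than part of the original analysis; the function names `phi_line` and `phi_inf` are our own.

```python
import math


def phi_line(p, rho, length):
    """Eq. 29: saliency of a straight line of the given length whose
    fraction p (0 < p < 1) consists of actual elements."""
    q = 1.0 - p
    return p / (q * math.log(rho)) * (rho ** (q * length) - 1.0)


def phi_inf(p, rho):
    """Eq. 30: limit of Eq. 29 as the length goes to infinity."""
    q = 1.0 - p
    return -p / (q * math.log(rho))


# Compare the infinite-line bound with a finite line of length 100.
for p in (0.90, 0.95):
    print(p, round(phi_inf(p, 0.7), 2), round(phi_line(p, 0.7, 100.0), 2))
```

For $\rho = 0.7$ the bound evaluates to about 25.23 at $p = 0.9$ and 53.27 at $p = 0.95$, matching the table, and the finite-line score stays below the bound.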

A similar analysis can be performed for a circle with uniformly
distributed gaps. Unlike the infinite straight line, here the circle has
finite size. Given a circle with radius $r$, a fraction $p$ of actual
elements, and a fraction $q = 1 - p$ of virtual elements, we set
$\sigma(s) = p$ for all $s$, $g(0,s) = qs$ and, using Eq. 15,
$C(0,s) = e^{-s/r^2}$. Thus, the saliency of $\Gamma$ is given by

$$\Phi(\Gamma) = p \int_0^{\infty} \rho^{qs} e^{-s/r^2}\,ds = p \int_0^{\infty} e^{(q \ln \rho - 1/r^2)s}\,ds, \tag{31}$$

which, since $q \ln \rho < 0$, simplifies to

$$\Phi(\Gamma) = \frac{p}{\frac{1}{r^2} - q \ln \rho}. \tag{32}$$

Examples of the values assumed by $\Phi(\Gamma)$, for $\rho = 0.7$, are
given in Table 2. As in the case of straight lines, the saliency of
circles attenuates very fast with gap size. For example, the saliency of
a circle of radius 16 that contains no gaps is 256. With 5% gaps its
saliency reduces to 43.70, which is identical to the saliency of a
gap-free circle of radius 6.61. Similarly, with 10% gaps the saliency of
the same circle reduces to 22.74, which corresponds to the saliency of a
gap-free circle of radius 4.77.

p \ ρ       0.1       0.3       0.5       0.7       0.9
0.5        0.43      0.83      1.44      2.80      9.49
0.7        1.01      1.94      3.37      6.54     22.15
0.9        3.91      7.48     12.98     25.23     85.42
0.93       5.77     11.03     19.17     37.25    126.10
0.95       8.25     15.78     27.41     53.27    180.33
0.97      14.04     26.86     46.65     90.65    306.88
0.99      43.00     82.23    142.83    277.56    939.63
1             ∞         ∞         ∞         ∞         ∞

Table 1: $\Phi_\infty$ for an infinite straight line with uniform gaps
as a function of $p$ and $\rho$. Note that the score for infinite lines
gives an upper bound on the score of finite ones.

p \ r         1         2         4         8        16
0.5        0.42      1.17      2.07      2.58      2.74
0.7        0.63      1.96      4.13      5.71      6.31
0.9        0.87      3.15      9.17     17.55     22.74
0.93       0.91      3.38     10.63     22.91     32.21
0.95       0.93      3.55     11.82     28.39     43.70
0.97       0.96      3.72     13.25     36.85     66.41
0.99       0.99      3.90     14.98     51.58    132.48
1             1         4        16        64       256

Table 2: The saliency values of circles with uniform gaps as a function
of $p$ and $r$ (for $\rho = 0.7$).
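The circle values quoted above follow from Eq. 32 together with the fact that a gap-free circle of radius $r$ has saliency $r^2$. The following sketch is an illustrative check; the names `circle_saliency` and `equivalent_gap_free_radius` are our own and do not appear in the original analysis.

```python
import math


def circle_saliency(p, r, rho=0.7):
    """Eq. 32: saliency of a circle of radius r in which a fraction p of
    the elements is actual and gaps are spread uniformly."""
    q = 1.0 - p
    return p / (1.0 / r ** 2 - q * math.log(rho))


def equivalent_gap_free_radius(saliency):
    """Radius of the gap-free circle with the same saliency (Phi = r^2)."""
    return math.sqrt(saliency)


for p in (1.0, 0.95, 0.90):
    phi = circle_saliency(p, r=16.0)
    print(p, round(phi, 2), round(equivalent_gap_free_radius(phi), 2))
```

With $r = 16$ the sketch gives 256 for the gap-free circle and approximately 43.70 and 22.74 for 5% and 10% gaps, with equivalent gap-free radii of about 6.61 and 4.77.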

Next, we analyze the case of a short curve, $\Gamma$, that lies near a
circle such that the two are not touching. Again, we ask whether such a
curve may become more salient than the circle by using the saliency of
the circle. Let $\Phi(\Gamma)$ denote the saliency of $\Gamma$, let $g$
be the gap length between $\Gamma$ and the circle, and let $K$ be the
total squared curvature of $\Gamma$ plus the gap to the circle. The
saliency $\Phi_e$ of the first element on $\Gamma$ is given by

$$\Phi_e = \Phi(\Gamma) + \rho^{g} e^{-K} \Phi_c, \tag{33}$$

where $\Phi_c$ denotes the saliency of the circle. We obtain that
$\Phi_e > \Phi_c$ (recall that $\Phi_c = r^2$) when

$$\Phi(\Gamma) + \rho^{g} e^{-K} r^2 > r^2, \tag{34}$$

which implies that

$$\frac{1}{r^2}\Phi(\Gamma) > 1 - \rho^{g} e^{-K}. \tag{35}$$

Note that since $\rho < 1$ the right-hand side grows larger as the gap
size increases. Consequently, the chance of an element becoming more
salient than a circle by taking its saliency from the circle decreases
with the gap size. Suppose finally that $\Gamma$ is a straight line of
length $l$ such that its continuation is tangential to the circle, in
which case $\Phi(\Gamma) = l$ and $K = 0$. The condition (Eq. 35)
becomes

$$\frac{l}{r^2} > 1 - \rho^{g}. \tag{36}$$

For $l = 1$ and $\rho = 0.7$ we obtain that $\Gamma$ is almost never
more salient than the circle:

$$\frac{1}{r^2} > 1 - 0.7^{g}, \tag{37}$$

or

$$r < \frac{1}{\sqrt{1 - 0.7^{g}}}. \tag{38}$$

From this equation, $r$ must be extremely small to allow an element to
win with gaps: for $g = 1$ we have $r < 1.826$, and for $g = 2$ we have
$r < 1.400$. As $l$ increases, the likelihood of $\Gamma$ becoming more
salient increases.
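The bound on the circle radius can be evaluated for other gap sizes and segment lengths. The sketch below solves the tangent-segment condition $l/r^2 > 1 - \rho^{g}$ for $r$; the function name `max_circle_radius` is an illustrative assumption.

```python
import math


def max_circle_radius(gap, length=1.0, rho=0.7):
    """Largest circle radius for which a straight, tangentially aligned
    segment of the given length, separated from the circle by `gap` > 0,
    can still become more salient than the circle (l / r^2 > 1 - rho^g)."""
    return math.sqrt(length / (1.0 - rho ** gap))


for gap in (1.0, 2.0, 4.0):
    print(gap, round(max_circle_radius(gap), 3))
```

For a unit-length segment the limit is about 1.826 at $g = 1$ and 1.400 at $g = 2$, and it keeps shrinking toward 1 as the gap grows; a longer segment raises the limit by a factor of $\sqrt{l}$.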

The final issue we discuss is the saliency measure's preference for how
gaps are distributed along a curve. Elder and Zucker [4] conducted
experiments which demonstrate that, when a fraction of the boundary of
an object is missing, humans' recognition ability is hindered more when
the missing fraction is contained all in one gap than when it is spread
over several gaps. For any curve, the saliency measure encourages gaps
to be as far as possible from the starting point. For an open curve with
a fixed total gap length, the worst case is one large gap at the start
of the curve and the best case is one large gap at the end. Since the
network evaluates the saliency of curves from all possible starting
points, it prefers that gaps be pushed as far as possible from the
smooth sections of the curve.

While for open curves there is no clear preference for a single long gap
versus a few short gaps, for closed curves such a preference does exist.
Consider a circle $\Gamma$ with one large gap. Let $\Gamma_1$ be the
open curve corresponding to the part of the circle that is actual, and
let $\Gamma_2$ be the gap. The most salient element on the circle will
be the first element of $\Gamma_1$, since the saliency measure prefers
gaps to be as far as possible from the start of the curve. So the most
salient curve will go first through $\Gamma_1$, then through $\Gamma_2$,
and then loop back to $\Gamma_1$. Let $\alpha r$ denote the length of
gap $\Gamma_2$. Since only the actual elements contribute to the
saliency of a curve, the saliency obtained by going once around the
circle is simply $\Phi(\Gamma_1)$. Using Eq. 10 the saliency of the
circle becomes

$$\Phi(\Gamma) = \frac{\Phi(\Gamma_1)}{1 - \rho^{\alpha r} e^{-2\pi/r}}. \tag{39}$$

If the circle now contains, say, two gap sections of the same total
length $\alpha r$, then the saliency obtained by going once around the
circle will be reduced, because a gap will be closer to the start of the
curve. As a consequence, the numerator in Eq. 39 will become smaller.
The denominator, however, will remain unchanged since the total gap
length and curvature do not change. This analysis clearly applies when
the circle is fragmented by more than two gaps. Consequently, the
saliency of the circle will become smaller as a result of fragmentation.
An example is given in Fig. 6. The figure shows three circles of the
same radius and with the same total gap size. The network prefers the
one that contains one long gap over the ones in which the gaps are
fragmented. This behavior disagrees with Elder and Zucker's results.
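The effect of fragmentation on a closed curve can be sketched numerically in the continuous setting. The code below is an illustration under stated assumptions, not the network itself and not the configuration of Fig. 6: the function name, the radius, and the gap sizes are our own choices, and the traversal is assumed to start at the beginning of the first listed segment rather than at the best starting point found by the network.

```python
import math


def fragmented_circle_saliency(segments, r, rho=0.7):
    """Continuous saliency of a circle of radius r with gaps.

    `segments` lists (is_actual, arc_length) pairs that cover the circle
    exactly once, in traversal order. One pass is integrated in closed
    form, then the geometric series over repeated passes is summed.
    """
    s = 0.0      # arc length travelled so far
    g = 0.0      # gap length accumulated so far
    once = 0.0   # score collected during a single pass
    for is_actual, length in segments:
        if is_actual:
            # integral of rho^g * exp(-t / r^2) for t in [s, s + length]
            once += rho ** g * r ** 2 * (
                math.exp(-s / r ** 2) - math.exp(-(s + length) / r ** 2)
            )
        else:
            g += length
        s += length
    # After one pass s is the circumference, so exp(-s / r^2) = exp(-2*pi/r).
    return once / (1.0 - rho ** g * math.exp(-s / r ** 2))


r = 10.0
circumference = 2.0 * math.pi * r
total_gap = 10.0
actual = circumference - total_gap

one_gap = [(True, actual), (False, total_gap)]
two_gaps = [(True, actual / 2), (False, total_gap / 2)] * 2
four_gaps = [(True, actual / 4), (False, total_gap / 4)] * 4

for name, segs in (("one gap", one_gap), ("two gaps", two_gaps),
                   ("four gaps", four_gaps)):
    print(name, round(fragmented_circle_saliency(segs, r), 2))
```

All three configurations share the same denominator, so the ordering is decided by the single-pass score alone, which shrinks as the gaps are split up and moved closer to the starting point.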


Figure  6:  Three  circles  of  the  same  radius  with  the  same  total  gap  size.  Using  Shashua  and  Ullman’s  network  the  saliency 
values  are  46.85  (left),  27.93  (middle),  and  23.27  (right). 


4  Complexity and Convergence Analysis


In this section we analyze the complexity of Shashua and Ullman's
Saliency Network. Denote the total number of pixels in the image by $p$
and the number of discrete orientation elements at every pixel by $b$.
The network has $pb$ elements. At each iteration every element has to
evaluate all the saliencies obtained from the elements connected to it.
The complexity of each iteration is therefore $pb^2$. The question then
is how many iterations are required before the network converges.
Clearly, if we did not allow cycles the longest curve may be of length
$p$, and so the total complexity of the computation would be at most
$O(p^2b^2)$. We show below that even when cycles are considered the
network converges in a number of iterations that is linear in $p$, and
so the total complexity is indeed $O(p^2b^2)$.

Given a cycle $\Gamma$, denote by $\Phi_n$ the score obtained after
going $n$ times around the cycle, by $K$ the energy of $\Gamma$, and by
$g$ the total gap size. Then from Eq. 10 the saliency of $\Gamma$ is
$\Phi = \Phi_1/(1 - \rho^{g} e^{-K})$. After going $n$ times around the
cycle, the accumulated score becomes (this is the finite sum of the
geometric series in Eq. 10)

$$\Phi_n = \Phi_1 \frac{1 - \rho^{ng} e^{-nK}}{1 - \rho^{g} e^{-K}}. \tag{40}$$

Define the relative error by

$$E_n = \frac{\Phi - \Phi_n}{\Phi}. \tag{41}$$

Substituting Eq. 40 gives $E_n = \rho^{ng} e^{-nK}$. We can now compute
the number of cycles, $n$, needed to achieve an error $E_n = \epsilon$:

$$\ln \epsilon = n(g \ln \rho - K), \qquad n = \frac{\ln \epsilon}{g \ln \rho - K}. \tag{42}$$

Assume $\Gamma$ is a circle of radius $r$ with no gaps. Then $g = 0$,
$K = 2\pi/r$, and

$$n = -\frac{r \ln \epsilon}{2\pi}. \tag{43}$$

The number of cycles around the circle is $O(r)$. Assuming one iteration
covers one unit of arc length, the number of iterations for each cycle
is $2\pi r$. Thus, from Eq. 43 the total number of iterations needed to
achieve an $\epsilon$ error is

$$N = 2\pi r n = -r^2 \ln \epsilon. \tag{44}$$

Consequently, the total number of iterations required is
$O(r^2) = O(p)$, where $p$ is the size of the image. As an example, the
number of cycles required to achieve 1% error ($\epsilon = 0.01$,
$\ln \epsilon \approx -4.605$) is $n \approx 2.303\,r/\pi \approx 0.733r$,
and therefore the number of iterations is $N \approx 4.605r^2$.
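The convergence estimates of Eqs. 42 to 44 are easy to evaluate for other error levels and radii. The following sketch is illustrative only; the function names and the optional `arc_per_iteration` parameter (the arc length covered by one iteration, taken as 1 in Eq. 44) are our own assumptions.

```python
import math


def cycles_for_error(eps, gap, energy, rho=0.7):
    """Eq. 42: number of passes around a cycle with total gap length
    `gap` and total squared curvature `energy` needed to reach a
    relative error of eps."""
    return math.log(eps) / (gap * math.log(rho) - energy)


def iterations_for_circle(eps, r, arc_per_iteration=1.0):
    """Eqs. 43-44 for a gap-free circle of radius r: passes needed times
    the number of iterations per pass."""
    n = cycles_for_error(eps, gap=0.0, energy=2.0 * math.pi / r)
    return n * 2.0 * math.pi * r / arc_per_iteration


r = 10.0
n_cycles = cycles_for_error(0.01, gap=0.0, energy=2.0 * math.pi / r)
n_iterations = iterations_for_circle(0.01, r)
print(round(n_cycles / r, 3), round(n_iterations / r ** 2, 3))
```

For a 1% error the two printed ratios are approximately 0.733 and 4.605, the coefficients quoted above. Passing a positive `gap` to `cycles_for_error` shows that gaps reduce the number of passes needed, since each pass is attenuated more strongly.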

Fig. 7 shows an image of a gap-free and a fragmented circle on a noisy
background. As expected, the Saliency Network chooses the gap-free
circle as the most salient curve. Using Eq. 44, we can predict the
number of iterations for the network to converge on the gap-free circle.
The radius of the circle is $r \approx 11.39$, and one iteration covers
an arc length $\Delta s \approx 2.983$ ($r$ and $\Delta s$ are discussed
in the next section). So to obtain 1% error, Eq. 44 gives

$$N = \left(\frac{2\pi r}{\Delta s}\right) n \approx \frac{4.605\,r^2}{\Delta s} \approx 200.4. \tag{45}$$

We ran the network for 200 iterations on the left image in Fig. 7, and
the maximum saliency converged to 130.8. This generally agrees (for a 1%
relative error) with $r^2 = 129.8$, as predicted by Eq. 17, and with the
exact saliency of the circle under discretization, 132.1 (see next
section).

In Fig. 7, the input image has dimensions 128 × 128, and the example was
run on a network with 24 orientation elements per pixel. The 200
iterations took 54 minutes using C code on a Sun SPARCstation 5 with
32 MB of memory. Note that this time for convergence is independent of
the number of background elements, so the execution time would be the
same if the gap-free circle were alone in the image. To illustrate this
point, Fig. 14 shows an example of two circles, the larger of which is
the gap-free circle from Fig. 7. The input image contains no clutter
but, nevertheless, as before, the Saliency Network took 55 minutes to
converge.

By taking the maximal possible circle in the image, we account for the
worst-case complexity of the network. This is because any larger closed
curve must accumulate comparable energy in order not to exceed the
boundaries of the image. We can therefore conclude that the worst-case
complexity of the network is $O(p^2b^2)$, which is the square of the
number of elements in the network.


5  Discrete  Implementations 

Our analysis of Shashua and Ullman's method has concentrated on the
theoretical, continuous version of their saliency measure. Shashua and
Ullman proposed to compute this measure using a network of finitely
many, locally connected elements. In this section we analyze the effect
of computing the saliency measure on discrete networks. We show in
particular that the network is extremely sensitive to the number of
discrete orientation elements allocated per pixel.

Figure 7: Running the Saliency Network on an image with gap-free and
fragmented circles and a background of 200 random line segments (at the
left). The saliency map and most salient curve image are shown in the
center and right pictures, respectively. After 200 iterations, the
maximum saliency was 130.8. The time to convergence and the maximum
saliency are independent of the number of background elements.

Shashua and Ullman's network has the following structure. Let $p$ be the
number of pixels in the image, and let $b$ be the number of orientation
elements at each pixel. (Shashua and Ullman set $b = 16$.) The network
contains $p \times b$ processors, one for every orientation element at
every pixel in the image. A continuous arc is assigned between every two
elements that meet at the same pixel in the underlying image. The local
curvature $\kappa$ corresponding to such an arc is approximated using
the formula

$$\kappa = \frac{2 \tan(\alpha/2)}{\Delta e}, \tag{46}$$

where $\alpha$ denotes the angle between the neighboring elements and
$\Delta e$ denotes the length of an orientation element. This formula
represents the curvature of a circular arc that joins the midpoints of
two elements of the same length. As an example, the gap-free circle of
Fig. 7 was generated using a 24-sided regular polygon with one element
per side and with $\Delta e = 3$. Then $\alpha = \pi/12$, and Eq. 46
gives $\kappa \approx 0.08777$. Therefore, the radius of the
circular-arc approximation is $r = 1/\kappa \approx 11.39$, which gives
the arc length and total squared curvature covered by one iteration to
be $\Delta s = \alpha r \approx 2.983$ and $K = \alpha/r \approx 0.02298$,
respectively.
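The numbers in this example can be reproduced with a few lines of Python. The sketch assumes that the curvature formula is $\kappa = 2\tan(\alpha/2)/\Delta e$, which is the form consistent with the quoted values; the function name `arc_parameters` is our own. It also combines the result with Eq. 45 to recover the predicted iteration count for a 1% error.

```python
import math


def arc_parameters(alpha, delta_e):
    """Circular-arc approximation for two elements of length delta_e
    that meet at turning angle alpha (radians)."""
    kappa = 2.0 * math.tan(alpha / 2.0) / delta_e  # local curvature
    r = 1.0 / kappa                                # radius of the arc
    delta_s = alpha * r                            # arc length per iteration
    k_total = alpha / r                            # total squared curvature
    return kappa, r, delta_s, k_total


# 24-sided polygon with one element of length 3 per side.
kappa, r, delta_s, k_total = arc_parameters(math.pi / 12.0, 3.0)

# Eq. 45: iterations needed for a 1% relative error on this circle.
n_iterations = -math.log(0.01) * r ** 2 / delta_s

print(round(kappa, 5), round(r, 2), round(delta_s, 3), round(k_total, 5))
print(round(n_iterations, 1))
```

The total squared curvature is computed as $\alpha/r$, which equals $\kappa^2 \Delta s$ for an arc of constant curvature.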

Shashua and Ullman set $\Delta e$ to be constant, and hence ignored the
different sizes of elements of different orientations. As a result, a
horizontal or vertical line of length $l$ obtains the same saliency as a
diagonal line of length $l\sqrt{2}$. Shashua and Ullman's implementation
therefore encourages curves that are aligned with the main axes of the
image.

A more critical issue is the number of orientation elements in the
network. Consider for example a nearly horizontal straight line segment.
Due to aliasing, the line may be cut in the middle so that one part of
the line is raised up by one pixel (see Fig. 8). Let $2l$ be the length
of the line. The saliency of the first element along the line is given
by

$$\Phi = l + e^{-K} + (l - 1)e^{-2K}, \tag{47}$$

where $K$ is the total squared curvature over the change in orientation
$\alpha$ corresponding to raising the line up by one pixel (which is
also the total squared curvature when the line returns to horizontal).

Consider now a pair of lines of length $l$ meeting at a corner such that
they form the same orientation change $\alpha$. Since a corner forms
only one turn, the obtained saliency will be

$$\Phi = l + l e^{-K}. \tag{48}$$

Subtracting Eq. 47 from Eq. 48 leaves $(l - 1)e^{-K}(1 - e^{-K})$, which
is positive whenever $l > 1$ and $K > 0$. Consequently, we obtain the
paradoxical result that the corner is more salient than the nearly
straight continuation. Hence straight lines oriented such that they
deviate slightly from horizontal will often be less salient than
corners.

Figure 8: Discretization effect on a straight line. Left figure: the
discretization of a straight line. Right figure: a corner. The saliency
value obtained for a perfectly horizontal line of length 20 is 20.00,
the saliency value for the discretized, nearly horizontal line of the
same length is 18.41, and the saliency value of a corner is 19.10.

The discretization problem carries over to other, more complicated
examples. Consider a circle of radius $r$. When $r$ is sufficiently
small, the circle can be approximated by a regular polygon where each
side includes a single orientation element (Fig. 9). Let $K$ be the
total squared curvature corresponding to a turn $\alpha = 2\pi/n$, where
$n$ is the number of sides of the polygon. The discrete saliency of such
a regular polygon is given by

$$\Phi = 1 + e^{-K} + e^{-2K} + \cdots = \frac{1}{1 - e^{-K}} = \frac{1}{1 - e^{-\Delta s/r^2}}, \tag{49}$$

where $\Delta s$ is the arc length of the circle that is covered in one
iteration. Returning again to the gap-free circle in Fig. 7, for this
circle $r = 11.39$ and $\Delta s = 2.983$ (see above), and so under
discretization its saliency is 132.1. When $\Delta s/r^2$ is small,

$$\frac{1}{1 - e^{-\Delta s/r^2}} \approx \frac{r^2}{\Delta s}. \tag{50}$$

The approximation in this equation improves as $r$ increases; this
happens when the number of sides in the polygonal approximation
increases and as a result fits a circle more closely. When $r$ is so
large that a good approximation by a regular polygon would require finer
orientation changes (less than $2\pi/b$), a faithful discretization of
the circle would involve many inflections (that is, clockwise turns
balanced by counter-clockwise turns). These inflections would be
penalized unduly by the network.

Figure 9: Discretizing a circle with a regular polygon.
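The discrete saliency of the Fig. 7 circle can be checked against Eqs. 49 and 50. The sketch below uses the quoted value $K \approx 0.02298$. Reproducing the reported 132.1 requires multiplying the unit-side result of Eq. 49 by the element length $\Delta e = 3$, in the manner of a polygon with sides of length $l$; this scaling is our reading of the text and is an assumption, as are the function names.

```python
import math


def polygon_saliency(k_turn, side_length=1.0):
    """Closed form of the geometric series in Eq. 49 for unit sides,
    multiplied by the side length (assumed scaling for longer sides)."""
    return side_length / (1.0 - math.exp(-k_turn))


def first_order_approximation(r, delta_s):
    """Right-hand side of Eq. 50, valid when delta_s / r**2 is small."""
    return r ** 2 / delta_s


# Values quoted in the text for the gap-free circle of Fig. 7.
r, delta_s, delta_e, k_turn = 11.39, 2.983, 3.0, 0.02298

unit_side = polygon_saliency(k_turn)
scaled = polygon_saliency(k_turn, side_length=delta_e)
approx = first_order_approximation(r, delta_s)
print(round(unit_side, 2), round(scaled, 1), round(approx, 2))
```

The unit-side value and the first-order approximation differ by roughly one percent here, and the scaled value is close to both the reported discrete saliency and the continuous value $r^2 \approx 129.8$.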

We could improve the saliency of the discretization if we instead
represent the circle by a regular polygon with $b$ sides; each side now
contains more than one element. This, however, will still not result in
a reasonable approximation to the continuous saliency of the circle, as
the following observation shows. Eq. 49 gives the saliency of a regular
polygon with $b$ sides, each of unit length, in terms of $K$, the total
squared curvature assigned for a turn of $2\pi/b$. The saliency of a
similar regular polygon in which every side is of length $l$ is given by

$$\Phi_l = \frac{l}{1 - e^{-K}} = l\,\Phi, \tag{51}$$

where $\Phi$ is the unit-side saliency of Eq. 49. The saliency of a
regular $b$-sided polygon, therefore, increases linearly with the length
of each side, $l$. Since $l$ is directly related to the radius of the
circumscribed circle, the saliency of the polygon also increases
linearly with the radius of that circle. Since the continuous saliency
of a circle grows quadratically with the radius of the circle (Eq. 17),
we obtain that, as $r$ grows, the saliency of the polygon will
considerably underestimate the saliency of the circle.

The results shown in this section establish that the Saliency Network
faces serious difficulties due to discretization of the range of
orientations. A faithful implementation of the continuous saliency
measure would require a very fine discretization. The number of
orientation elements needed to completely avoid the problems mentioned
in this section is of the order of $\sqrt{p}$, where $p$ is the total
number of pixels in the image. With this number of orientation elements
the overall time complexity of the network (see Section 4) becomes
$O(p^2b^2) = O(p^3)$.

6  Applications  to  Grouping 

The Saliency Network is viewed by many not only as a mechanism for
shifting attention to salient structures, but also as a method for the
initial grouping of curves. The problems of identifying salient
structures and of grouping curves are not identical. Saliency can be
viewed as the problem of identifying the "odd man out," whereas grouping
is the problem of identifying image structures that are likely to belong
to a single object. The criteria of length and straightness can separate
a smooth object from a background of short, broken curves (e.g., a disc
on a background of grass), but they may be inappropriate for segmenting
equally smooth objects in cluttered scenes, since long smooth curves
will often traverse a few objects. For example, Fig. 10 shows a case
where the Saliency Network successfully finds a curve that belongs to an
object of interest, but Fig. 11 shows another case where the most
salient curve traverses more than one object. Nevertheless, in many
cases the salient curves may lie on objects of interest, and so may be
useful for grouping.

The Saliency Network computes, for every element, the saliency of the
most salient curve emerging from that element. For grouping, we would
like to recover the curves that made those locations salient. In fact,
we show below that, after the network converges, the most salient curves
can be extracted in the following simple way, which was proposed by
Shashua and Ullman. To extract the optimal (most salient) curve emerging
from an element, during the computation one has to store for every
element $p$ a single pointer $\pi(p)$ which points to the second element
on the optimal curve emerging from $p$. At the end of the computation,
the best curve from $p$ can be retrieved by tracing these pointers
starting from $p$. To obtain the most salient curve in the image, we
would trace from the most salient element.
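The pointer-based extraction can be sketched on a toy network. The code below is a minimal illustration, not Shashua and Ullman's implementation: it assumes the update rule $E_i \leftarrow \sigma_i + \rho_i \max_j C_{ij} E_j$ with $C_{ij} = e^{-K_{ij}}$, and the function names, the graph (a six-element cycle with one extra element feeding into it), and all numeric values are hypothetical.

```python
import math


def run_network(sigma, rho, coupling, iterations):
    """Iterate a toy saliency network and record the pointers pi(p).

    sigma[i]    : 1.0 for an actual element, 0.0 for a virtual one
    rho[i]      : attenuation applied when a curve continues past i
    coupling[i] : dict {j: C_ij} of successors j and curvature factors
    """
    n = len(sigma)
    score = list(sigma)       # iteration 0: every element on its own
    pointer = [None] * n      # pi(p): next element on the best curve from p
    for _ in range(iterations):
        new_score = list(sigma)
        for i in range(n):
            best_j, best_val = None, 0.0
            for j, c_ij in coupling[i].items():
                val = c_ij * score[j]
                if val > best_val:
                    best_j, best_val = j, val
            new_score[i] = sigma[i] + rho[i] * best_val
            pointer[i] = best_j
        score = new_score
    return score, pointer


def trace_curve(pointer, start, max_len):
    """Follow pi(p) from `start`; stop at a dead end, at an element that
    was already visited (a closed cycle), or after max_len elements."""
    curve, seen = [start], {start}
    current = pointer[start]
    while current is not None and current not in seen and len(curve) < max_len:
        curve.append(current)
        seen.add(current)
        current = pointer[current]
    return curve


# Elements 0..5 form a cycle; element 6 is a short tail leading into 0.
c = math.exp(-0.1)
coupling = [{1: c}, {2: c}, {3: c}, {4: c}, {5: c}, {0: c},
            {0: math.exp(-0.5)}]
sigma = [1.0] * 7
rho = [1.0] * 7

score, pointer = run_network(sigma, rho, coupling, iterations=200)
cycle = trace_curve(pointer, start=0, max_len=100)
tail = trace_curve(pointer, start=6, max_len=100)
print(cycle, tail, round(score[0], 3), round(score[6], 3))
```

In this toy example the cycle elements converge to the geometric-series limit $1/(1 - c)$, and the curve traced from the tail element runs straight into the cycle, which is the merging behavior discussed later in this section.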

This tracing procedure follows from the property of extensibility. The
basic idea of extensibility, which is illustrated in Fig. 12, is that at
convergence any suffix of an optimal curve is optimal as well. The
following argument shows that at convergence the tracing procedure
produces the optimal curves. At any iteration $N$, we know from the
definition of extensibility (Eq. 3) that $\Phi(p)$ is the saliency of
the most salient curve emerging from $p$ among all curves leaving $p$ of
length less than or equal to $N$, and we know that $\pi(p)$ points to
the next element on that most salient curve. Therefore, at $N = \infty$
(i.e., at convergence), $\Phi(p)$ is the saliency of the most salient
curve emerging from $p$ among all possible curves, and $\pi(p)$ points
to the next element on that curve. We will assume for simplicity that
the optimal curve from $p$ is unique. Let
$\Gamma = (p_0, p_1, p_2, \ldots)$ be the optimal curve from some
element $p_0$. Then for any suffix $\Gamma_i$ of $\Gamma$
($\Gamma_i = (p_i, p_{i+1}, p_{i+2}, \ldots)$, $i > 0$), $\Gamma_i$ must
be the optimal curve from $p_i$; otherwise, if a different curve
$\Gamma_i'$ were more salient than $\Gamma_i$, then from Eq. 5 we could
substitute $\Gamma_i'$ for $\Gamma_i$ and obtain a new curve from $p_0$
that is more salient than $\Gamma$. But if $\Gamma_i$ is optimal, then
$\pi(p_i)$ must equal $p_{i+1}$, since $\pi(p_i)$ points to the next
element on the optimal curve from $p_i$. Thus following the pointers
traces out the optimal curve.

Figure 10: Shashua and Ullman's saliency map for a cluttered scene. The
scene image (on the left) was smoothed with a Gaussian of standard
deviation 1 and then the gradient magnitude was thresholded to get a
binary image (second picture from the left). This edge image was the
input to the network. The third picture displays Shashua and Ullman's
saliency map, and the fourth shows the curve (71 elements) emanating
from the most salient element, for which $\Phi = 263.5$.

Figure 11: Shashua and Ullman's saliency map for a cluttered scene. From
left to right, the first picture is the scene image and the second is an
edge image obtained from the scene image by thresholding the gradient
magnitude. The edge image was the input to the network. The third
picture displays Shashua and Ullman's saliency map, and the fourth shows
the curve (51 elements) emanating from the most salient element, for
which $\Phi = 210.1$.

Figure 12: Characterization of extensible functions. If the most salient
curve from $p_i$ goes through $p_j$ then, at convergence, the most
salient curve from $p_i$ must coincide with the most salient curve from
$p_j$. At any finite time, however, the most salient curves from $p_i$
and $p_j$ may not overlap anywhere except at $p_j$. In particular, after
$n$ iterations the most salient curve from $p_i$ will be the straight
line of length $n$, but the most salient curve from $p_j$ could be along
the curved segment from $p_j$ to $p_k$.

The fact that the tracing procedure discussed above supplies the optimal
curves has serious implications for grouping. When two curves share a
common section (as in Fig. 13), the elements on the common section must
decide between the two curves. So if two different objects are touching,
the best curve through one of the objects will always merge into the
other; this situation is illustrated by the real image example in
Fig. 11, where the boundary curves of two objects (a flashlight and a
telephone) merge together.

The  two-circle  example  can  also  be  problematic  for 
grouping  due  to  the  problem  of  leeching.  Leeching  can 
cause  non-salient  curves  next  to  a  salient  one  to  include 
the  salient  curve  as  part  of  them.  We  have  already  seen 
an  example  in  which,  due  to  this  property,  a  non-salient 
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Figure  13:  The  problem  of  leeching.  Each  element  of  a 
cnrve  chooses  one  neighboring  element  with  which  to  com¬ 
bine.  Conseqnently  the  shared  element  mnst  choose  between 
the  two  shapes,  and  so  the  best  cnrves  emerging  from  p* 
and  pj  will  merge  together.  The  larger  circle  is  the  most 
salient  cnrve,  and,  for  all  elements  pj  on  the  larger  circle, 
^(pj)  =  .  The  elements  on  the  smaller  circle  draw  their 

saliencies  from  the  larger  circle,  and  the  saliencies  decrease 
as  the  elements  get  fnrther  from  the  jnnction  element.  For 
every  element  p*  on  the  smaller  circle,  <  T(p^)  <  R^ . 


curve  becomes  salient  unduly  (Section  3.2).  Another  ex¬ 
ample  is  shown  in  Fig.  13,  in  which  the  elements  on 
the  smaller  circle  draw  their  saliencies  from  the  larger 
circle,  and  as  a  result  the  most  salient  curves  emanat¬ 
ing  from  these  elements  combine  with  the  larger  circle. 
In  addition,  we  show  next  that  the  smaller  circle  can 
only  be  traced  from  the  least  salient  element  over  both 
curves;  this  could  be  problematic  if  a  grouping  system 
wishes  to  recover  both  circles.  Consider  an  element  pi 
on  the  smaller  circle,  and  let  s  be  the  arc  length  from  pi 
to  the  connecting  element  between  the  two  circles  (de¬ 
noted  by  pfc).  The  saliency  of  the  larger  circle  at  con¬ 
vergence  is  according  to  Eq.  17.  From  Eq.  16,  the 
saliency  of  a  circular  arc  of  extent  s  on  the  smaller  circle 
is  r^(l  —  e~^).  Finally,  using  Eq.  9  we  can  derive  the 
saliency  of  pi : 

<F(p*)  =  r^(l  -  e“^)  + 

=  -\- {R^  —  r‘^)e~  (52) 

It can be readily seen that $\Phi(p_i)$ decreases as $s$, the arc
length from $p_i$ to $p_k$, increases. Therefore, the saliencies
of the elements on the smaller circle decrease as the elements
get further away from the junction element, with
the constraint that $r^2 < \Phi(p_i) < R^2$. As a consequence,
if a grouping system were to try to recover the smaller
circle, it would have to trace the curve from the least
salient element on both curves.
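
As a rough numerical illustration of this behavior, the sketch below evaluates the saliency of an element on the smaller circle as a function of its arc length $s$ from the junction. It assumes that the attenuation along an arc of length $s$ on a circle of radius $r$ is $e^{-s/r^2}$, so that the score combines the arc's own saliency with the attenuated score $R^2$ of the larger circle. The function name and the radii are hypothetical and chosen only for illustration.

```python
import math


def junction_saliency(s, r, R):
    """Saliency of an element on the smaller circle, at arc length s from the junction.

    Assumption of this sketch: the attenuation accumulated along an arc of
    length s on a circle of radius r is exp(-s / r**2).
    """
    decay = math.exp(-s / r**2)
    # Saliency of the arc itself, plus the attenuated saliency of the larger circle.
    return r**2 * (1.0 - decay) + R**2 * decay


r, R = 5.0, 10.0  # hypothetical radii, with r < R
for s in (0.0, 5.0, 25.0, 100.0):
    print(f"s = {s:6.1f}   saliency = {junction_saliency(s, r, R):7.2f}")
```

Under these assumptions the printed values fall monotonically from $R^2$ toward $r^2$ as $s$ grows, so the element farthest from the junction has the lowest score on either circle.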

Fig. 14 shows the results of the Saliency Network on
an analogous two-circle example. To get the optimal
curves, we first traced the curve from the most salient
element (for which $\Phi = 130.8$), which gave the larger
circle. To compute the second most salient curve, we ignored
the elements on the most salient curve and selected
among the remaining elements the next most salient element.
We then traced the curve from this element. The
traced curve emerged from the selected element and then
went around the larger circle. We repeated this process
to obtain the third most salient curve. The new curve again
resembled the second most salient curve, except that
it was one element longer. As discussed above, elements
near the most salient cycle tend to merge with the cycle
and then draw their saliencies from it. The saliencies of
these elements attenuate as they get further away.
To extract the smaller circle, we would have to trace the
curve from the least salient active element. Fig. 15 shows
another example in which the same phenomenon occurs,
except this time the larger circle is less salient because
of gaps. As in the previous case, tracing from the
least salient element in the image is needed to recover
the larger circle.

Thus far our analysis has concentrated on the asymptotic
behavior of the Saliency Network. In their experiments,
Shashua and Ullman demonstrated that good
results could be obtained after only a few dozen iterations.
In this they relied on the property that after
the $n$th iteration the score attributed by the network to
every element represents the saliency of the best curve
of length $n + 1$ emanating from the element. There is
a drawback to this approach, however. Whereas after
running the network for a small number of iterations the
saliency values obtained for short curves already approximate
their asymptotic saliencies, long curves are still
significantly undervalued. This is particularly problematic
when closed curves are considered, because their asymptotic
scores benefit from being considered infinitely long.
Thus, when the network is run for a relatively small number
of iterations, closed curves are evaluated as if they
were short, open curves, and as a result closure would
not be encouraged by the network.
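
The size of this effect can be illustrated with a minimal sketch. It assumes the local update takes the form $E^{(n+1)} = 1 + c\,E^{(n)}$ for an element on a gap-free cycle, where $c$ is the coupling between consecutive elements, here taken to be $e^{-1/r^2}$ for unit-length elements on a circle of radius $r$. The update form, the coupling, and all names below are assumptions of the sketch, not details taken from the implementation discussed in this paper.

```python
import math


def cycle_saliency(n_iters, coupling):
    """Score of an element on a closed, gap-free curve after n_iters updates.

    By symmetry every element on the cycle has the same score, so the
    maximization over neighbors reduces to E <- 1 + coupling * E.
    """
    e = 1.0  # score before any iteration: the element alone
    for _ in range(n_iters):
        e = 1.0 + coupling * e
    return e


def asymptotic_cycle_saliency(coupling):
    """Limit of the geometric series 1 + c + c^2 + ... for 0 <= c < 1."""
    return 1.0 / (1.0 - coupling)


def line_saliency(n_iters, length):
    """Score of the first element of a straight, gap-free line of `length` elements."""
    return float(min(n_iters + 1, length))


r = 10.0                      # hypothetical radius, in units of element length
c = math.exp(-1.0 / r**2)     # assumed coupling between consecutive elements
for n in (10, 30, 100, 1000):
    ratio = cycle_saliency(n, c) / asymptotic_cycle_saliency(c)
    print(f"n = {n:5d}   circle at {ratio:6.1%} of its limit,"
          f" 10-element line at {line_saliency(n, 10) / 10.0:6.1%}")
```

In this setting a short open line reaches its final score within a handful of iterations, while the circle needs a number of iterations on the order of $r^2$ before its score approaches the asymptotic value.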

Furthermore, when the network is not run to convergence
the tracing procedure is not guaranteed to extract
the best curve. Consider for instance the picture in
Figure 12. The picture contains a straight line of length $n$
emerging from an element $p_i$, and it contains a curved
segment between elements $p_j$ and $p_k$, which merges into
the straight line. We choose the curved segment so that
after $n$ iterations it is more salient (due to having greater
length) than the portion of the straight line to the right
of $p_j$. Consequently after $n$ iterations $\pi(p_j)$ will point
to the curved segment. As well, after the $n$ iterations
the best curve emerging from $p_i$ will be the straight line
of length $n$ (and its current saliency will be $n$). But if
we now trace the pointers starting from $p_i$, we will mistakenly
think that the best curve of length $n$ contains a
portion of the curved segment between $p_j$ and $p_k$. This
problem could be avoided if the entire history of the computation
were stored, but that of course would increase
the storage space required by the method considerably.
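
The following sketch reproduces this situation on a small hypothetical graph. It assumes an update of the form $E^{(n+1)}_i = 1 + \max_j f_{ij} E^{(n)}_j$ with no gaps, a coupling of 1.0 along the straight line and 0.95 along the curved segment; the graph, the coupling values, and the function names are all illustrative assumptions. Tracing with only the final pointers is compared against tracing with the stored pointer history.

```python
def run_network(succ, n_iters):
    """Run n_iters synchronous updates; return final scores and per-iteration pointers."""
    energy = {v: 1.0 for v in succ}
    history = []                      # history[k] holds the pointers chosen at iteration k + 1
    for _ in range(n_iters):
        new_energy, pointers = {}, {}
        for v, edges in succ.items():
            best_w, best_val = None, 0.0
            for w, f in edges:
                val = f * energy[w]
                if val > best_val:
                    best_w, best_val = w, val
            new_energy[v] = 1.0 + best_val
            pointers[v] = best_w
        energy = new_energy
        history.append(pointers)
    return energy, history


def trace_final(history, start, n_steps):
    """Follow only the pointers left after the last iteration."""
    path = [start]
    for _ in range(n_steps):
        nxt = history[-1][path[-1]]
        if nxt is None:
            break
        path.append(nxt)
    return path


def trace_with_history(history, start):
    """Follow, at step t, the pointer chosen at iteration n - t."""
    path = [start]
    for pointers in reversed(history):
        nxt = pointers[path[-1]]
        if nxt is None:
            break
        path.append(nxt)
    return path


# Straight line S0..S5; a curved segment C1..C5 leaves at S3 and rejoins at S5.
succ = {
    "S0": [("S1", 1.0)], "S1": [("S2", 1.0)], "S2": [("S3", 1.0)],
    "S3": [("S4", 1.0), ("C1", 0.95)], "S4": [("S5", 1.0)], "S5": [],
    "C1": [("C2", 0.95)], "C2": [("C3", 0.95)], "C3": [("C4", 0.95)],
    "C4": [("C5", 0.95)], "C5": [("S5", 0.95)],
}
energy, history = run_network(succ, 5)
print("score at S0:     ", energy["S0"])
print("final pointers:  ", trace_final(history, "S0", 5))
print("stored history:  ", trace_with_history(history, "S0"))
```

In this example the score at `S0` corresponds to the straight line, yet the final pointer at `S3` already favors the longer curved segment, so only the history-based trace recovers the straight line. The price is one pointer table per iteration.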

To conclude, Shashua and Ullman’s Saliency Network
may be used for grouping, because it is both efficient and
guaranteed to find the optimal curves in an image, according
to a measure-of-fit that prefers length, straightness,
and few gaps. When the network reaches convergence,
the optimal curves in the image can be extracted
through a straightforward tracing procedure. The algorithm
is efficient because, as we have shown in Section 4,
it searches the exponential space of possible image curves
in time that is polynomial (either quadratic or cubic) in
the size of the image. To speed up the computation even
further, Shashua and Ullman recommended running the
network for a small number of iterations. If not run to
convergence, however, the network is no longer guaranteed
to provide the optimal curves, and for longer curves
the computed saliencies can be significantly undervalued.
Even if run to convergence, due to curve junctions
the method still has serious problems in extracting any
salient curve other than the best. The Saliency Network,
therefore, may be useful for directing attention to a single
object, but will be unsuitable in cluttered images for
extracting a number of different objects.

Figure 14: Shashua and Ullman’s method at image junctions. The second, third, and fourth most salient curves start from
the open curve attached to the circle and then proceed around the circle. The saliencies of the top four curves are 130.8, 122.1,
114.1, and 106.9. The saliency of the least salient active element is 68.2.

7  Conclusion 

The Saliency Network is a mechanism for identifying
salient curves in images based on length and straightness.
The method is attractive for several reasons. First, the
measure of saliency generally prefers long and smooth
curves over short or wiggly ones. In addition, the network
is guaranteed to find the most salient structure
according to the measure. While so doing, the network
fills in gaps with smooth completions and tolerates noise.
The network itself is locally connected and its size is proportional
to the size of the image. The locality is further
emphasized since the contribution of remote elements to
the score of a given element attenuates with the curvature
and gap length separating the remote elements from
the given element.

Our  analysis  revealed,  however,  certain  weaknesses 
with  the  method.  We  found  cases  in  which  the  most 
salient  element  does  not  lie  on  the  perceptually  most 
salient  curve.  Furthermore,  we  showed  cases  in  which 
the  saliency  measure  changes  its  preferences  when  curves 
are  scaled  uniformly.  Finally,  we  found  that  for  certain 
fragmented  curves  the  measure  prefers  large  gaps  over  a 
few  small  gaps  of  the  same  total  size. 

We believe that the weaknesses of the Saliency Network
are due largely to two important properties of the
saliency measure which are imposed by the Network’s
computation. The two properties are (1) extensibility
and (2) geometric convergence for cycles. Extensibility
implies that an optimal curve must be composed of subcurves
that are themselves optimal. Due to extensibility,
saliencies can be computed efficiently using a procedure
of recursive optimization (dynamic programming).
One of the benefits of extensibility is that, although the
Saliency Network finds the element from which the best
curve emanates rather than extracting the best curve
itself, the best curve can be extracted through a simple
tracing procedure. Also due to extensibility, however,
the method has difficulties at junctions; this leaves
unclear how one could use the method for grouping in
cluttered images.
Figure 15: Shashua and Ullman’s method at image junctions. The gaps in the larger circle cause it to be less salient than
the smaller circle. The saliencies of the top four curves are 33.3, 13.9, 7.54, and 5.47. The saliency of the least salient active
element is 4.46.


The second property exhibited by the saliency measure
is that the measure decreases in a geometric series
when evaluated along a cycle. This property, which is
essential for convergence, was used in this paper to compute
the network’s time complexity. In particular, we
showed that the number of iterations is linear in the size
of the image, and as a consequence the overall complexity
in serial implementations is quadratic in the size of the
network. This complexity result is based on the assumption
that the number of discrete orientations per pixel
is independent of the size of the image. On the other
hand, we also showed that the network’s rankings of curves
can be significantly altered when the range of possible
orientations is coarsely sampled. With proper sampling,
however, the complexity of the network becomes cubic
in the size of the image.

In sum, extensibility and geometric convergence enable
the saliency measure to be optimized and the optimal
curves to be recovered efficiently (in polynomial
time), but at the same time they restrict the set of possible
functions that can be used as measures of saliency.
Partly due to this restriction, the chosen measure has
certain properties that run counter to what we believe
is expected of a measure of saliency. It remains to be
seen whether variations of the current measure can be
defined that remedy some of its weaknesses while still
allowing the saliency map and most salient curves to be
computed efficiently.

Code  availability 

We  have  made  available  our  C-code  implementation  of 
the  Saliency  Network.  To  retrieve  the  code,  ftp  to 
“ftp.ai.mit.edu,”  then  log  in  as  “anonymous,”  then  cd  to 
“pub/users/tda/,”  and  then  get  and  uncompress  “susal- 
code.tar.Z.” 